Predicting and Using Implicit Discourse Elements in Chinese-English Translation
Abstract
In machine translation (MT), implicitation can occur when elements such as discourse markers and pronouns are not expected or mandatory in the source language but need to be realised in the target language for a coherent translation. These ‘implicit’ elements can be seen as both a barrier to MT and an important source of information. However, identifying where such elements are needed and producing them are non-trivial tasks. In this paper we examine the effect of implicit elements on MT and propose methods to identify them and make them explicit. As a starting point, we use human-translated and aligned data to decide where to insert placeholders for these elements. We then fully automate this process by devising a prediction model to decide if and where implicit elements should occur and be made explicit. Our experiments compare statistical machine translation models built with and without these explicitation processes. Models built on data marked for discourse elements show substantial improvements over the baseline.
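The placeholder step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the marker list, the `<DM:...>` placeholder format, the function name `insert_placeholders`, and the rule for choosing the insertion point (just before the source word aligned to the next aligned target word) are all assumptions made for illustration. The idea is that a target-side discourse marker with no aligned source word is treated as implicit in the source, and a placeholder is inserted on the source side.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical marker list, for illustration only.
DISCOURSE_MARKERS: Set[str] = {"because", "but", "so", "however", "although"}


def insert_placeholders(
    src: List[str],
    tgt: List[str],
    alignment: Iterable[Tuple[int, int]],
    markers: Set[str] = DISCOURSE_MARKERS,
) -> List[str]:
    """Return a copy of `src` with placeholders for unaligned target markers.

    `alignment` holds (source_index, target_index) word-alignment pairs.
    """
    # Map each target index to the source indices aligned to it.
    tgt_to_src = {}
    for s, t in alignment:
        tgt_to_src.setdefault(t, []).append(s)

    insertions = []  # (source position, placeholder), in target order
    for j, word in enumerate(tgt):
        if word.lower() not in markers or j in tgt_to_src:
            continue  # not a marker, or already explicit in the source
        # Assumed heuristic: place the placeholder before the source word
        # aligned to the next aligned target word, else at the end.
        pos = len(src)
        for k in range(j + 1, len(tgt)):
            if k in tgt_to_src:
                pos = min(tgt_to_src[k])
                break
        insertions.append((pos, f"<DM:{word.lower()}>"))

    out = list(src)
    # Insert right to left so earlier positions remain valid; the stable
    # sort keeps target order among placeholders sharing a position.
    for pos, tag in reversed(sorted(insertions, key=lambda item: item[0])):
        out.insert(pos, tag)
    return out


src = ["下雨", "了", "，", "我", "没", "去"]
tgt = ["It", "rained", ",", "so", "I", "did", "not", "go"]
alignment = [(0, 1), (1, 1), (2, 2), (3, 4), (4, 6), (5, 7)]
print(insert_placeholders(src, tgt, alignment))
# ['下雨', '了', '，', '<DM:so>', '我', '没', '去']
```

In this toy pair, "so" has no aligned Chinese token, so a placeholder is placed before 我, the word aligned to the following target token "I". A real pipeline would need automatic alignments and a much larger marker inventory, and the paper's prediction model replaces the need for target-side data at test time.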
Similar resources
The Use of Second-Person Reference in Advertisement Translation with Reference to Translation between Chinese and English
This research aimed to review the use of second-person reference in advertisement translation, work out the general rules, and provide guidance to translators. Using second-person reference is common in the advertising discourse. Addressing audiences directly involves their attention and in this way enhances their memorization of the advertised message. Second-person reference can be realized v...
Towards a discourse relation-aware approach for Chinese-English machine translation
Translation of discourse relations is one of the recent efforts to incorporate discourse information into statistical machine translation (SMT). While existing works focus on disambiguation of ambiguous discourse connectives, or transformation of discourse trees, only explicit discourse relations are tackled. A greater challenge exists in machine translation of Chinese, since implicit discourse...
Crosslingual Annotation and Analysis of Implicit Discourse Connectives for Machine Translation
Usage of discourse connectives (DCs) differs across languages, thus addition and omission of connectives are common in translation. We investigate how implicit (omitted) DCs in the source text impact various machine translation (MT) systems, and whether a discourse parser is needed as a preprocessor to explicitate implicit DCs. Based on the manual annotation and alignment of 7266 pairs of disc...
A Linguistic Study on the Translation of Parvin E’tesami’s Poems into English Using Catford’s Category Shifts
The present study aimed to investigate the translation into English by Alaeddin Pazargadi of Parvin E’tesami’s poems; in particular, it attempted to analyze the structural elements such as verbs, nouns, pronouns, adjectives, adverbs, articles, conjunctions, prepositions, and interjections in them. Considering the relationship between Linguistics and Translation Studies, the theoretical framewor...
Improving the Translation of Discourse Markers for Chinese into English
Discourse markers (DMs) are ubiquitous cohesive devices used to connect what is said or written. However, across languages there is divergence in their usage, placement, and frequency, which is considered to be a major problem for machine translation (MT). This paper presents an overview of a proposed thesis, exploring the difficulties around DMs in MT, with a focus on Chinese and English. The ...